Abstract
The rapid growth of digital media and social networks has led to a surge in the creation and dissemination of manipulated media, commonly known as deepfakes. Detecting deepfakes is increasingly important for ensuring digital trust, privacy, and security. In this study, we propose a deep learning-based approach for deepfake detection using Convolutional Neural Networks (CNNs) for classification and Multi-task Cascaded Convolutional Networks (MTCNN) for accurate face detection and alignment. The CNN model is trained on a curated dataset of real and manipulated images, while MTCNN ensures proper face preprocessing for improved detection performance. The trained model is deployed through a Streamlit web application, featuring login, registration, detector, and about pages, providing a user-friendly interface for real-time inference. The model achieved an accuracy of 46%, reflecting the challenges of detecting subtle manipulations with limited data. Despite this limitation, the integration of CNN and MTCNN within a web-based interface demonstrates a practical framework for deepfake detection. Future improvements may include leveraging larger datasets, pretrained models, and ensemble techniques to enhance detection accuracy and generalization.
Introduction
Deepfakes are AI-generated manipulated images and videos that can appear highly realistic, and they pose a growing threat. While they have useful applications in entertainment, they also carry serious risks such as misinformation, identity fraud, and privacy violations. Detecting deepfakes is challenging because the visual changes are often subtle and vary with lighting, pose, and image quality.
To address this, the study proposes a deep learning-based detection system that combines Convolutional Neural Networks (CNNs) for feature extraction and classification with Multi-task Cascaded Convolutional Networks (MTCNN) for accurate face detection and alignment. The alignment step yields consistent input data and improves detection performance.
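As a concrete illustration of the alignment step (not the study's actual code), the sketch below crops a detected face and resizes it to a fixed square input. It assumes the detection-dictionary format returned by the open-source `mtcnn` package (`"box"` as `[x, y, width, height]`); the resize is a simple nearest-neighbour implementation so the sketch needs only NumPy:

```python
import numpy as np

def crop_and_resize(image, detection, size=160):
    """Crop the detected face and resize it to a fixed square input.

    `image` is an HxWx3 uint8 array; `detection` follows the mtcnn
    package's output format: {"box": [x, y, w, h], ...}.
    """
    x, y, w, h = detection["box"]
    # Clamp the box origin to the image bounds (MTCNN can return
    # slightly negative coordinates near frame edges).
    x, y = max(x, 0), max(y, 0)
    face = image[y:y + h, x:x + w]
    # Nearest-neighbour resize via integer index maps.
    rows = np.arange(size) * face.shape[0] // size
    cols = np.arange(size) * face.shape[1] // size
    return face[rows][:, cols]

# Example: a dummy 480x640 frame with one hypothetical detection.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
det = {"box": [200, 100, 120, 150]}
face = crop_and_resize(frame, det)
```

In a real pipeline the detection dictionary would come from `mtcnn.MTCNN().detect_faces(frame)`, and a library resize (e.g. OpenCV or Pillow) would normally replace the nearest-neighbour loop.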
The methodology includes:
Data collection from benchmark datasets (FaceForensics++ and DFDC)
Feature extraction and classification using CNN layers
Model training with standard optimization techniques
Deployment via a user-friendly Streamlit web application for real-time detection
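The feature-extraction step in the methodology above can be pictured with a minimal, dependency-light sketch: one convolution, ReLU, and max-pooling stage implemented in plain NumPy. This is purely illustrative of what a single CNN layer computes; the study's actual architecture and hyperparameters are not specified here:

```python
import numpy as np

def conv2d(x, kernel):
    """Valid 2-D convolution of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, k=2):
    """Non-overlapping k x k max pooling (trims any remainder)."""
    H, W = (x.shape[0] // k) * k, (x.shape[1] // k) * k
    return x[:H, :W].reshape(H // k, k, W // k, k).max(axis=(1, 3))

# One layer's worth of feature extraction on a toy 8x8 "image"
# whose intensity increases left to right.
img = np.arange(64, dtype=float).reshape(8, 8)
# This kernel responds to left-to-right intensity increase (vertical edges).
edge_kernel = np.array([[-1.0, 1.0], [-1.0, 1.0]])
features = max_pool(relu(conv2d(img, edge_kernel)))
```

A real model stacks many such layers (with learned kernels) and ends in fully connected layers that produce the real/fake score; frameworks like TensorFlow or PyTorch replace these loops with optimized primitives.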
The system allows users to upload images or videos and receive instant predictions (Real or Fake) with confidence scores.
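A hedged sketch of how such an interface might be wired up (the function names and the 0.5 threshold below are illustrative assumptions, not the study's actual code): a pure helper maps the model's fake-probability to a label and confidence score, and a separate function shows the Streamlit wiring, kept inside a function so the snippet runs without a Streamlit runtime:

```python
def interpret_score(p_fake, threshold=0.5):
    """Map a model's fake-probability to a (label, confidence) pair.

    Confidence is the probability assigned to the winning class.
    """
    if p_fake >= threshold:
        return "Fake", p_fake
    return "Real", 1.0 - p_fake

def run_app(model, preprocess):
    """Illustrative Streamlit detector page; run via `streamlit run app.py`.

    `model` and `preprocess` are hypothetical stand-ins for the trained
    CNN and the MTCNN crop-and-align step.
    """
    import streamlit as st  # deferred so the helper above stays testable
    st.title("Deepfake Detector")
    uploaded = st.file_uploader("Upload an image", type=["jpg", "png"])
    if uploaded is not None:
        face = preprocess(uploaded)          # MTCNN crop + resize
        p_fake = float(model.predict(face))  # CNN sigmoid output
        label, conf = interpret_score(p_fake)
        st.metric("Prediction", label, f"{conf:.0%} confidence")
```

Video input would add a frame-sampling loop before `preprocess` and aggregate per-frame scores, e.g. by averaging.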
Results show that while preprocessing improves consistency and usability, the model achieves low accuracy (around 46%, below the 50% chance level for a binary task), underscoring the difficulty of detecting subtle deepfake manipulations. The system is functional and accessible but requires improvements such as larger datasets, stronger models, and dedicated techniques for video analysis.
Conclusion
The proposed methodology demonstrates an end-to-end approach for deepfake detection, utilizing MTCNN for accurate face detection and alignment, and a CNN for hierarchical feature extraction and classification. The system effectively distinguishes between real and manipulated images, providing predicted labels and confidence scores to the user through a Streamlit web interface. Although the model achieved an overall accuracy of 46%, the methodology highlights the importance of consistent preprocessing, robust feature extraction, and real-time deployment for practical applications. The results indicate that the model captures key facial patterns, but challenges remain in detecting subtle manipulations, which may require larger datasets, ensemble models, or temporal analysis for video-based detection. The Streamlit interface demonstrates the usability of the framework, allowing users to interactively test images and videos, while the display of confidence scores provides insight into model reliability, even though the predictions are not stored. Overall, this study establishes a foundational framework for real-time deepfake detection, emphasizing both the potential and limitations of current methods, and offers directions for future research to enhance accuracy, generalization, and practical deployment in real-world scenarios.
References
[1] Mogili, U., Ampolu, K. V., Rajasekharam, B., & Timothy, M. J. AI-Driven Interaction in AR Environments, in Journal of Digital Economy, 2024, Volume 3, Issue 1, pp. 228-234.
[2] Timothy, M. J., Rajasekharam, B., Ampolu, K. V., & Mogili, U. Threat Detection Using AI in Cybersecurity Systems, in IJIS, 2023, Volume 7, Issue 1, pp. 1-7.
[3] Ampolu, K.V., Mogili, U., Timothy, M. J., & Rajasekharam, B. Machine Learning Models for Predictive Maintenance, in IJIS, 2022, Volume 6, Issue 4, pp. 1-7.
[4] Rajasekharam, B., Timothy, M. J., Mogili, U., Ampolu, K.V., Machine Learning Models for Predictive Maintenance, in JDE, 2023, Volume 2, Issue 2, pp. 95-101.
[5] Soujania, B., Ampolu, K. V., Timothy, M. J., & Mogili, U. (2025) Classifying Disease Information Forums through Semantic Similarity-Based Machine Learning, Science, Technology and Development Journal, Volume XIV, Issue II, pp 67-75.
[6] B Satish Kumar, Kavitha C., Mogili, U.R., S. Pallam Shetty (2022). “Application of Machine Learning To Enhance the Performance of The Prophet Routing Protocol For Delay Tolerant Networks”. Journal for Basic Sciences, Volume 23, Issue 5, 2107-2116, DOI:10.37896/JBSV23.5/2278.
[7] I. Sree Geeta, Umamaheswararao Mogili. (2022), “Use of Several Machine Learning Algorithms for Effective Prediction of Cyberbullying”, International Journal of Creative Research Thoughts, Volume 10, Issue 6, pp 17.
[8] Mogili, U., & Mohamed, A. (2023, November). Artificial intelligence and machine learning in the fields of education, medical, and smart phones. In AIP conference proceedings (Vol. 2917, No. 1, p. 050012). AIP Publishing LLC.
[9] Esram, R., Deepak, B. B. V. L., Mogili, U. R., & Syam Sundar, P. (2022). Agribots concepts and operations—a review. Applications of Computational Methods in Manufacturing and Product Design: Select Proceedings of IPDIMS 2020, 31-40.
[10] A. Rössler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, and M. Nießner, “FaceForensics++: Learning to detect manipulated facial images,” in Proc. IEEE Int. Conf. Comput. Vis., 2019, pp. 1–11.
[11] B. Dolhansky, J. Bitton, B. Pflaum, J. Lu, R. Howes, M. Wang, and C. Canton Ferrer, “The DeepFake Detection Challenge (DFDC) dataset,” arXiv preprint arXiv:2006.07397, 2020.
[12] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, “Joint face detection and alignment using multitask cascaded convolutional networks,” IEEE Signal Process. Lett., vol. 23, no. 10, pp. 1499–1503, 2016.
[13] I. Goodfellow et al., “Generative adversarial nets,” in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
[14] J. Thies, M. Zollhöfer, M. Stamminger, C. Theobalt, and M. Nießner, “Face2Face: Real-time face capture and reenactment of RGB videos,” Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 2387–2395.
[15] D. Afchar, V. Nozick, J. Yamagishi, and I. Echizen, “MesoNet: a compact facial video forgery detection network,” in Proc. IEEE Int. Workshop Inf. Forensics Security, 2018, pp. 1–7.
[16] Y. Li, M. Chang, and S. Lyu, “In ictu oculi: Exposing AI-created fake videos by detecting eye blinking,” in Proc. IEEE Int. Workshop Inf. Forensics Security, 2018, pp. 1–7.
[17] Z. Dang, X. Wu, and Y. Yu, “Detection of Deepfake videos based on facial landmark motion patterns,” IEEE Access, vol. 8, pp. 178790–178799, 2020.
[18] S. Agarwal, A. Farid, M. Gu, H. He, and M. Nagano, “Detecting deep-fake videos from appearance and behavior,” arXiv preprint arXiv:2002.04238, 2020.
[19] D. Güera and E. J. Delp, “Deepfake video detection using recurrent neural networks,” in Proc. IEEE Int. Conf. Adv. Video Signal Based Surveill., 2018, pp. 1–6.
[20] P. Korshunov and S. Marcel, “DeepFakes: a new threat to face recognition? Assessment and detection,” arXiv preprint arXiv:1812.08685, 2018.
[21] A. Rössler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, and M. Nießner, “FaceForensics: A large-scale video dataset for forgery detection in human faces,” arXiv preprint arXiv:1803.09179, 2018.
[22] S. K. Kundu, S. D. Roy, and S. Saha, “Real-time deepfake detection framework using CNN and OpenCV,” Int. J. Comput. Vision, vol. 128, pp. 2107–2120, 2020.
[23] M. Holtz and P. Mittal, “Streamlit for machine learning and data science: building interactive web apps,” arXiv preprint arXiv:2009.10778, 2020.
[24] S.S.D.K. Maha Lakshmi, Umamaheswararao Mogili, Sravya Eluri, Dogga Ramachandra Rao. (2023), “Online Dynamic Out Patient Queue System for Automated Token Generation in Hospitals”, Science, Technology and Development Journal, Volume XII, Issue VII, pp 71-78, DOI:23.18001.STD.2023.V12I07.23.37707.